Smart Paper Notes

Deep-Learning-Based, Multi-Timescale Load Forecasting in Buildings: Opportunities and Challenges from Research to Deployment

Sakshi Mishra , Stephen M. Frank , Anya Petersen , Robert Buechler , Michelle Slovensky

Category: Machine Learning

2020-08-12

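Electricity load forecasting for buildings and campuses is becoming increasingly important as the penetration of distributed energy resources (DERs) grows. Operating and dispatching DERs efficiently requires reasonably accurate predictions of future energy consumption, so that on-site generation and storage assets can be dispatched optimally in near real time. Electric utilities have traditionally forecast load for load pockets spanning large geographic areas, so forecasting has not been common practice among building and campus operators. Given the growing trend of research and prototyping in the grid-interactive efficient buildings domain, characteristics beyond simple forecast accuracy are important in determining the true utility of an algorithm for smart buildings. These characteristics include the overall design of the deployed architecture and the operational efficiency of the forecasting system. In this work, we present a deep-learning-based load forecasting system that predicts building load at 1-hour intervals for 18 hours into the future. We also discuss the challenges associated with real-time deployment of such systems, as well as the research opportunities presented by a fully functional forecasting system developed within the National Renewable Energy Laboratory's Intelligent Campus program.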
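The forecasting setup described above (hourly predictions covering the next 18 hours) can be framed as a supervised windowing problem. The following is a minimal sketch of that framing only. The 24-hour lookback, the function name `make_windows`, and the synthetic series are illustrative assumptions, not details taken from the paper.

```python
def make_windows(series, lookback=24, horizon=18):
    """Split an hourly load series into (history, target) pairs.

    Each target holds the next `horizon` hourly values (18 in the abstract).
    `lookback` is an assumed history length, not a value from the paper.
    """
    pairs = []
    last_start = len(series) - lookback - horizon
    for start in range(last_start + 1):
        history = series[start:start + lookback]
        target = series[start + lookback:start + lookback + horizon]
        pairs.append((history, target))
    return pairs


# Synthetic hourly load values, purely for illustration.
hourly_load = [float(100 + (h % 24)) for h in range(72)]
windows = make_windows(hourly_load)
```

Each pair could then be fed to any sequence model; the choice of model is outside the scope of this sketch.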


MAUD: An Expert-Annotated Legal NLP Dataset for Merger Agreement Understanding

Steven H. Wang , Antoine Scardigli , Leonard Tang , Wei Chen , Dimitry Levkin , Anya Chen , Spencer Ball , Thomas Woodside , Oliver Zhang , Dan Hendrycks

Category: Natural Language Processing

2023-01-02

Reading comprehension of legal text can be a particularly challenging task due to the length and complexity of legal clauses and a shortage of expert-annotated datasets. To address this challenge, we introduce the Merger Agreement Understanding Dataset (MAUD), an expert-annotated reading comprehension dataset based on the American Bar Association's 2021 Public Target Deal Points Study, with over 39,000 examples and over 47,000 total annotations. Our fine-tuned Transformer baselines show promising results, with models performing well above random on most questions. However, on a large subset of questions, there is still room for significant improvement. As the only expert-annotated merger agreement dataset, MAUD is valuable as a benchmark for both the legal profession and the NLP community.


VC dimensions of group convolutional neural networks

Philipp Christian Petersen , Anna Sepliarskaia

Category: Machine Learning | Machine Learning (Statistics)

2022-12-19

We study the generalization capacity of group convolutional neural networks. We identify precise estimates for the VC dimensions of simple sets of group convolutional neural networks. In particular, we find that for infinite groups and appropriately chosen convolutional kernels, already two-parameter families of convolutional neural networks have an infinite VC dimension, despite being invariant to the action of an infinite group.


Adaptive Sequential Surveillance with Network and Temporal Dependence

Ivana Malenica , Jeremy R. Coyle , Mark J. van der Laan , Maya L. Petersen

Category: Machine Learning (Statistics)

2022-12-05

Strategic test allocation plays a major role in the control of both emerging and existing pandemics (e.g., COVID-19, HIV). Widespread testing supports effective epidemic control by (1) reducing transmission via identifying cases, and (2) tracking outbreak dynamics to inform targeted interventions. However, infectious disease surveillance presents unique statistical challenges. For instance, the true outcome of interest, one's positive infectious status, is often a latent variable. In addition, the presence of both network and temporal dependence reduces the data to a single observation. As testing entire populations regularly is neither efficient nor feasible, standard approaches to testing recommend simple rule-based testing strategies (e.g., symptom based, contact tracing), without taking into account individual risk. In this work, we study an adaptive sequential design involving n individuals over a period of $\tau$ time-steps, which allows for unspecified dependence among individuals and across time. Our causal target parameter is the mean latent outcome we would have obtained after one time-step, if, starting at time t given the observed past, we had carried out a stochastic intervention that maximizes the outcome under a resource constraint. We propose an Online Super Learner for adaptive sequential surveillance that learns the optimal choice of testing strategies over time while adapting to the current state of the outbreak. Relying on a series of working models, the proposed method learns across samples, through time, or both, based on the underlying (unknown) structure in the data. We present an identification result for the latent outcome in terms of the observed data, and demonstrate the superior performance of the proposed strategy in a simulation modeling a residential university environment during the COVID-19 pandemic.


Designing Ecosystems of Intelligence from First Principles

Karl J Friston , Maxwell J D Ramstead , Alex B Kiefer , Alexander Tschantz , Christopher L Buckley , Mahault Albarracin , Riddhi J Pitliya , Conor Heins , Brennan Klein , Beren Millidge

Category: Artificial Intelligence

2022-12-02

This white paper lays out a vision of research and development in the field of artificial intelligence for the next decade (and beyond). Its denouement is a cyber-physical ecosystem of natural and synthetic sense-making, in which humans are integral participants: what we call "shared intelligence". This vision is premised on active inference, a formulation of adaptive behavior that can be read as a physics of intelligence, and which inherits from the physics of self-organization. In this context, we understand intelligence as the capacity to accumulate evidence for a generative model of one's sensed world, also known as self-evidencing. Formally, this corresponds to maximizing (Bayesian) model evidence, via belief updating over several scales: i.e., inference, learning, and model selection. Operationally, this self-evidencing can be realized via (variational) message passing or belief propagation on a factor graph. Crucially, active inference foregrounds an existential imperative of intelligent systems; namely, curiosity or the resolution of uncertainty. This same imperative underwrites belief sharing in ensembles of agents, in which certain aspects (i.e., factors) of each agent's generative world model provide a common ground or frame of reference. Active inference plays a foundational role in this ecology of belief sharing, leading to a formal account of collective intelligence that rests on shared narratives and goals. We also consider the kinds of communication protocols that must be developed to enable such an ecosystem of intelligences and motivate the development of a shared hyper-spatial modeling language and transaction protocol, as a first (and key) step towards such an ecology.
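The statement that self-evidencing corresponds to maximizing (Bayesian) model evidence can be made concrete with the standard variational bound. The notation here is an assumption for illustration and does not come from the abstract: $o$ denotes observations, $s$ latent states, $p(o, s)$ the generative model, and $q(s)$ an approximate posterior.

```latex
\begin{aligned}
F[q] &= \mathbb{E}_{q(s)}\bigl[\log q(s) - \log p(o, s)\bigr] \\
     &= D_{\mathrm{KL}}\bigl[q(s)\,\|\,p(s \mid o)\bigr] - \log p(o) \\
     &\ge -\log p(o)
\end{aligned}
```

Because the KL term is non-negative, minimizing the free energy $F[q]$ maximizes a lower bound on the log model evidence $\log p(o)$, and the bound is tight when $q(s) = p(s \mid o)$.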


Abstract Visual Reasoning with Tangram Shapes

Anya Ji , Noriyuki Kojima , Noah Rush , Alane Suhr , Wai Keen Vong , Robert D. Hawkins , Yoav Artzi

Category: Natural Language Processing | Artificial Intelligence | Computer Vision | Machine Learning

2022-11-29

We introduce KiloGram, a resource for studying abstract visual reasoning in humans and machines. Drawing on the history of tangram puzzles as stimuli in cognitive science, we build a richly annotated dataset that, with >1k distinct stimuli, is orders of magnitude larger and more diverse than prior resources. It is both visually and linguistically richer, moving beyond whole shape descriptions to include segmentation maps and part labels. We use this resource to evaluate the abstract visual reasoning capacities of recent multi-modal models. We observe that pre-trained weights demonstrate limited abstract reasoning, which dramatically improves with fine-tuning. We also observe that explicitly describing parts aids abstract reasoning for both humans and models, especially when jointly encoding the linguistic and visual inputs. KiloGram is available at https://lil.nlp.cornell.edu/kilogram .


Multi-Agent Reinforcement Learning for Adaptive Mesh Refinement

Jiachen Yang , Ketan Mittal , Tarik Dzanic , Socratis Petrides , Brendan Keith , Brenden Petersen , Daniel Faissol , Robert Anderson

Category: Machine Learning | Artificial Intelligence

2022-11-02

Adaptive mesh refinement (AMR) is necessary for efficient finite element simulations of complex physical phenomena, as it allocates limited computational budget based on the need for higher or lower resolution, which varies over space and time. We present a novel formulation of AMR as a fully-cooperative Markov game, in which each element is an independent agent that makes refinement and de-refinement choices based on local information. We design a novel deep multi-agent reinforcement learning (MARL) algorithm called Value Decomposition Graph Network (VDGN), which solves the two core challenges that AMR poses for MARL: posthumous credit assignment due to agent creation and deletion, and unstructured observations due to the diversity of mesh geometries. For the first time, we show that MARL enables anticipatory refinement of regions that will encounter complex features at future times, thereby unlocking entirely new regions of the error-cost objective landscape that are inaccessible by traditional methods based on local error estimators. Comprehensive experiments show that VDGN policies significantly outperform error threshold-based policies in global error and cost metrics. We show that learned policies generalize to test problems with physical features, mesh geometries, and longer simulation times that were not seen in training. We also extend VDGN with multi-objective optimization capabilities to find the Pareto front of the tradeoff between cost and error.
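To illustrate what a Pareto front of the cost-error tradeoff means, the sketch below filters a set of (cost, error) pairs down to the non-dominated ones. It is a generic illustration and not the multi-objective method used in VDGN; the function name `pareto_front` and the candidate values are hypothetical.

```python
def pareto_front(points):
    """Return the (cost, error) pairs not dominated by any other pair.

    Lower is better for both objectives. A pair is dominated when another,
    different pair is at least as good in both cost and error.
    """
    front = []
    for p in points:
        dominated = any(
            q[0] <= p[0] and q[1] <= p[1] and q != p
            for q in points
        )
        if not dominated:
            front.append(p)
    return sorted(front)


# Hypothetical (cost, error) outcomes of five refinement policies.
candidates = [(1.0, 0.9), (2.0, 0.5), (3.0, 0.6), (4.0, 0.2), (2.5, 0.5)]
front = pareto_front(candidates)
```

Every point on the resulting front represents a policy for which lowering the error further would require paying a higher cost.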


Graph Neural Networks for Low-Energy Event Classification & Reconstruction in IceCube

R. Abbasi , M. Ackermann , J. Adams , N. Aggarwal , J. A. Aguilar , M. Ahlers , M. Ahrens , J. M. Alameddine , A. A. Alves Jr. , N. M. Amin

Category: Machine Learning

2022-09-07

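IceCube is a cubic-kilometer array of optical sensors built to detect atmospheric and astrophysical neutrinos between 1 GeV and 1 PeV, deployed 1.45 km to 2.45 km below the surface of the ice sheet at the South Pole. The classification and reconstruction of events from the in-ice detectors play a central role in the analysis of IceCube data. Reconstructing and classifying events is a challenge because of the detector geometry, the inhomogeneous scattering and absorption of light in the ice and, below 100 GeV, the relatively low number of signal photons produced per event. To address this challenge, IceCube events can be represented as point cloud graphs, with a Graph Neural Network (GNN) used as the classification and reconstruction method. The GNN is capable of distinguishing neutrino events from cosmic-ray backgrounds, classifying different neutrino event types, and reconstructing the deposited energy, direction, and interaction vertex. Based on simulation, we provide a comparison in the 1-100 GeV energy range with the current state-of-the-art maximum likelihood techniques used in current IceCube analyses, including the effects of known systematic uncertainties. For neutrino event classification, the GNN increases the signal efficiency by 18% at a fixed false positive rate (FPR), compared with current IceCube methods. Alternatively, the GNN reduces the FPR by more than a factor of 8 (to below half a percent) at a fixed signal efficiency. For the reconstruction of energy, direction, and interaction vertex, the resolution improves by an average of 13%-20% compared with current maximum likelihood techniques. When run on a GPU, the GNN is capable of processing IceCube events at nearly the median IceCube trigger rate of 2.7 kHz, which opens the possibility of using low-energy neutrinos in online searches for transient events.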
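One common way to turn a point cloud into a graph is to connect each point to its k nearest neighbours. The sketch below shows that idea under stated assumptions: the abstract does not specify how the graph is built, so the k-nearest-neighbour rule, the value of `k`, the function name `knn_edges`, and the hit coordinates are all illustrative.

```python
import math


def knn_edges(points, k=2):
    """Return directed edges (i, j) from each point to its k nearest neighbours."""
    edges = []
    for i, p in enumerate(points):
        # Distance from point i to every other point, paired with its index.
        others = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        others.sort()
        edges.extend((i, j) for _, j in others[:k])
    return edges


# Hypothetical sensor hit positions (x, y, z), purely for illustration.
hits = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (10.0, 10.0, 10.0)]
edges = knn_edges(hits, k=2)
```

In practice each node would also carry features such as hit time and charge, and the edge list would be passed to a GNN library; both are omitted here.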


Supervised Contrastive Learning to Classify Paranasal Anomalies in the Maxillary Sinus

Debayan Bhattacharya , Benjamin Tobias Becker , Finn Behrendt , Marcel Bengs , Dirk Beyersdorff , Dennis Eggert , Elina Petersen , Florian Jansen , Marvin Petersen , Bastian Cheng

Category: Computer Vision | Machine Learning

2022-09-05

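Using deep learning techniques, anomalies in the paranasal sinus system can be detected automatically in MRI images and can be further analyzed and classified based on their volume, shape, and other parameters such as local contrast. However, because training data are limited, traditional supervised learning methods often fail to generalize. Existing deep learning methods for paranasal anomaly classification have been used to diagnose at most one anomaly. In our work, we consider three anomalies. Specifically, we employ a 3D CNN to separate maxillary sinus volumes without anomalies from maxillary sinus volumes with anomalies. To learn robust representations from a small labeled dataset, we propose a novel learning paradigm that combines a contrastive loss and a cross-entropy loss. In particular, we use a supervised contrastive loss that encourages the embeddings of maxillary sinus volumes with and without anomalies to form two distinct clusters, while the cross-entropy loss encourages the 3D CNN to maintain its discriminative ability. We report that optimizing with both losses is advantageous over optimizing with only one loss. We also find that our training strategy improves label efficiency. With our method, a 3D CNN classifier achieves an AUROC of 0.85, whereas a 3D CNN classifier optimized with the cross-entropy loss alone achieves an AUROC of 0.66.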
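The combination of a supervised contrastive loss with a cross-entropy loss described above can be sketched in plain Python. This is a minimal illustration of the general technique, not the authors' implementation: the 3D CNN is omitted, and the temperature, the weighting factor `weight`, and all function names are assumptions.

```python
import math


def normalize(v):
    """Scale a vector to unit L2 norm."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def supervised_contrastive_loss(embeddings, labels, temperature=0.1):
    """Pull same-label embeddings together and push different labels apart."""
    z = [normalize(e) for e in embeddings]
    n = len(z)
    total, anchors = 0.0, 0
    for i in range(n):
        positives = [p for p in range(n) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor without positives contributes nothing
        # Temperature-scaled cosine similarity to every other sample.
        sims = {a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
                for a in range(n) if a != i}
        log_denominator = math.log(sum(math.exp(s) for s in sims.values()))
        total += -sum(sims[p] - log_denominator for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


def cross_entropy_loss(probabilities, labels):
    """Mean negative log-probability assigned to the true class."""
    eps = 1e-12
    return -sum(math.log(max(p[y], eps))
                for p, y in zip(probabilities, labels)) / len(labels)


def combined_loss(embeddings, probabilities, labels, weight=1.0):
    """Cross-entropy plus a weighted supervised contrastive term (weight is assumed)."""
    return (cross_entropy_loss(probabilities, labels)
            + weight * supervised_contrastive_loss(embeddings, labels))


# Toy 2-D embeddings: two samples near (1, 0) and two near (0, 1).
toy_embeddings = [(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.1, 0.9)]
clustered = supervised_contrastive_loss(toy_embeddings, [0, 0, 1, 1])
mixed = supervised_contrastive_loss(toy_embeddings, [0, 1, 0, 1])
```

The contrastive term is small when embeddings with the same label already sit close together (`clustered`) and large when labels cut across the clusters (`mixed`), which is the clustering pressure the abstract describes.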


Learning with Differentiable Algorithms

Felix Petersen

Category: Machine Learning

2022-09-01

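Classic algorithms and machine learning systems such as neural networks are both abundant in everyday life. While classic computer science algorithms are suited to the precise execution of exactly defined tasks, such as finding the shortest path in a large graph, neural networks allow learning from data to predict the most likely answer in more complex tasks, such as image classification, which cannot be reduced to an exact algorithm. To get the best of both worlds, this thesis explores combining the two concepts, leading to architectures that are more robust, better performing, more interpretable, more computationally efficient, and more data efficient. The thesis formalizes the idea of algorithmic supervision, which allows a neural network to learn from or in conjunction with an algorithm. When integrating an algorithm into a neural architecture, it is important that the algorithm be differentiable, so that the architecture can be trained end-to-end and gradients can be propagated back through the algorithm in a meaningful way. To make algorithms differentiable, this thesis proposes a general method for continuously relaxing algorithms by perturbing variables and approximating the expectation value in closed form, i.e., without sampling. In addition, this thesis proposes differentiable algorithms, such as differentiable sorting networks, differentiable renderers, and differentiable logic gate networks. Finally, this thesis presents alternative training strategies for learning with algorithms.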
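The idea of relaxing an algorithm by perturbing a variable and taking the expectation in closed form can be shown on the smallest possible case, a hard comparison `x > 0`. The sketch assumes logistic noise, a choice made here for illustration and not stated in the abstract. Under that assumption the expectation of the hard step is exactly the sigmoid, so no sampling is needed. All function names are hypothetical.

```python
import math
import random


def hard_greater_than_zero(x):
    """Non-differentiable step: 1 if x > 0, else 0."""
    return 1.0 if x > 0 else 0.0


def relaxed_greater_than_zero(x, beta=1.0):
    """Closed-form E[step(x + eps)] for eps ~ Logistic(0, beta), i.e. a sigmoid."""
    return 1.0 / (1.0 + math.exp(-x / beta))


def sampled_estimate(x, beta=1.0, n=100_000, seed=0):
    """Monte Carlo estimate of the same expectation, for comparison only."""
    rng = random.Random(seed)
    hits = 0.0
    for _ in range(n):
        # Inverse-CDF sampling of logistic noise; clamp u away from 0 and 1.
        u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)
        eps = beta * math.log(u / (1.0 - u))
        hits += hard_greater_than_zero(x + eps)
    return hits / n


closed_form = relaxed_greater_than_zero(0.5)
monte_carlo = sampled_estimate(0.5)
```

The closed form is smooth in `x`, so gradients can flow through it, while the sampled estimate only confirms that both compute the same expectation. The scale `beta` controls how closely the relaxation follows the hard step.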
